Sample Size Determination for Survival Endpoint
Power for a time-to-event (TTE) endpoint depends on several choices:
Power is driven primarily by the number of events (E), not the sample size (N):
Calculating E separately from N:
Accrual/Follow-up
Survival Distribution/Effect Size
Other Considerations:
Reference
Note:
We assume that a study is to be made comparing the survival (or healing) of a control group with an experimental group. The control group (group 1) consists of patients who will receive the existing treatment; where no existing treatment exists, group 1 consists of patients who will receive a placebo. The experimental group (group 2) will receive the new treatment. We assume that the critical event of interest is death and that the two treatments have survival distributions with instantaneous death (hazard) rates \(\lambda_1\) and \(\lambda_2\). A hazard rate is, roughly, a subject's probability of dying in the next short interval of time, given survival so far.
There are several ways to compare two hazard rates. One is the difference, \(\lambda_2-\lambda_1\). Another is the ratio, \(\lambda_2 / \lambda_1\), called the hazard ratio. \[ H R=\frac{\lambda_2}{\lambda_1} \] Note that since HR is formed by dividing the hazard rate of the experimental group by that of the control group, a treatment that has a smaller hazard rate than the control will have a hazard ratio that is less than one.
The hazard ratio may be formulated in other ways. If the proportions surviving during the study are \(S_1\) and \(S_2\) for the control and experimental groups, the hazard ratio is given by \[ H R=\frac{\log \left(S_2\right)}{\log \left(S_1\right)} \] Furthermore, if the two groups have exponential survival with median survival times \(M_1\) and \(M_2\), the hazard ratio is given by \[ H R=\frac{M_1}{M_2} \]
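Under the exponential (constant-hazard) model these formulations agree. A quick numeric sketch (not part of the original analysis; illustrative medians of 6 and 9 months, survival read at 12 months):

```python
from math import exp, log

# Illustrative medians (months); under exponential survival, lambda = ln(2)/median
M1, M2 = 6.0, 9.0
lam1, lam2 = log(2) / M1, log(2) / M2

# Hazard ratio from the hazard rates
hr_rates = lam2 / lam1

# Hazard ratio from survival proportions at a common time point (12 months)
t = 12.0
S1, S2 = exp(-lam1 * t), exp(-lam2 * t)
hr_surv = log(S2) / log(S1)

# Hazard ratio from the medians
hr_med = M1 / M2

print(hr_rates, hr_surv, hr_med)  # all three equal 2/3
```
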
We assume that the logrank test will be used to analyze the data once they are collected (in practice, Cox proportional hazards regression is often used for the actual analysis). The power of the logrank test can be calculated from \[ z_{1-\beta}=\frac{|H R-1| \sqrt{N(1-w) \varphi\left[\left(1-S_1\right)+\varphi\left(1-S_2\right)\right] /(1+\varphi)}}{(1+\varphi H R)}-z_{1-\alpha / k} \] where \(k\) is 1 for a one-sided hypothesis test or 2 for a two-sided test, \(\alpha\) and \(\beta\) are the error rates defined as usual, the \(z\)'s are the usual quantiles of the standard normal distribution, \(w\) is the proportion lost to follow-up, and \(\varphi\) is the sample size ratio between the two groups, \[ \varphi=\frac{N_2}{N_1}. \] Note that the null hypothesis is that the hazard ratio is one, i.e., \[ H_0: \frac{\lambda_2}{\lambda_1}=1 \]
ssc.logRank.Freedman <- function(S.trt, S.ctrl, sig.level = 0.05, power = 0.8,
                                 alternative = c("two.sided", "less", "greater"),
                                 pr = TRUE) {
  # FIXME: Relabel S.trt and S.ctrl as S.ctrl and S.trt
  alt <- match.arg(alternative)
  za <- if (alt == "two.sided") stats::qnorm(sig.level / 2) else stats::qnorm(sig.level)
  zb <- stats::qnorm(1 - power)
  haz.ratio <- log(S.trt) / log(S.ctrl)
  if (pr) {
    cat("\nHazard ratio:", format(haz.ratio), "\n")
    cat("Expected number of events:", 4 * (za + zb)^2 / log(1 / haz.ratio)^2, "\n")
  }
  # Freedman's total sample size
  (((haz.ratio + 1) / (haz.ratio - 1))^2) * (za + zb)^2 / (2 - S.trt - S.ctrl)
}
ssc.logRank.Freedman(0.5, 0.7, power = 0.817)
##
## Hazard ratio: 1.943358
## Expected number of events: 74.32079
## [1] 99.81032
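As a sanity check, the same arithmetic can be reproduced outside R. A minimal Python sketch (assuming the same inputs: S.trt = 0.5, S.ctrl = 0.7, two-sided alpha = 0.05, power = 0.817) matches the printed output above:

```python
from math import log
from statistics import NormalDist

S_trt, S_ctrl = 0.5, 0.7
alpha, power = 0.05, 0.817

za = NormalDist().inv_cdf(alpha / 2)   # two-sided lower quantile
zb = NormalDist().inv_cdf(1 - power)

# Freedman-style calculation, as in ssc.logRank.Freedman
hr = log(S_trt) / log(S_ctrl)                                            # ~1.9434
events = 4 * (za + zb) ** 2 / log(1 / hr) ** 2                           # ~74.32
n = ((hr + 1) / (hr - 1)) ** 2 * (za + zb) ** 2 / (2 - S_trt - S_ctrl)   # ~99.81
print(hr, events, n)
```
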
Using an unstratified log-rank test at the one-sided 2.5% significance level, a total of 282 events would provide 92.6% power to demonstrate a 33% risk reduction (hazard ratio for RAD001/placebo of about 0.67, as calculated from an anticipated 50% increase in median PFS, from 6 months in the placebo arm to 9 months in the RAD001 arm).
With uniform accrual of approximately 23 patients per month over 74 weeks and a minimum follow-up of 39 weeks, a total of 352 patients would be required to obtain 282 PFS events, assuming an exponential progression-free survival distribution with a median of 6 months in the placebo arm and 9 months in the RAD001 arm. Allowing for an estimated 10% loss to follow-up, a total of 392 patients should be randomized.
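These protocol numbers can be approximately reproduced with the standard Schoenfeld event formula plus the uniform-accrual event-probability calculation. The sketch below is illustrative, assuming exponential PFS and working in weeks; small differences from the protocol's 352/392 are expected because the original calculation likely used different software and rounding conventions:

```python
from math import exp, log, ceil
from statistics import NormalDist

z = NormalDist().inv_cdf
alpha_1sided, power = 0.025, 0.926
a, f = 74.0, 39.0                      # accrual and minimum follow-up (weeks)
med_C, med_E = 26.0, 39.0              # medians of 6 and 9 months, in weeks
lam_C, lam_E = log(2) / med_C, log(2) / med_E
hr = lam_E / lam_C                     # 2/3

# Schoenfeld: required events for a 1:1 randomized log-rank test
events = 4 * (z(1 - alpha_1sided) + z(power)) ** 2 / log(hr) ** 2   # ~282

# Probability of an event, uniform accrual over [0, a], exponential PFS
def p_event(lam, a, f):
    return 1 - (exp(-lam * f) - exp(-lam * (a + f))) / (a * lam)

p_bar = (p_event(lam_C, a, f) + p_event(lam_E, a, f)) / 2
n = events / p_bar                     # patients needed to observe the events
n_randomized = ceil(n / 0.9)           # inflate for 10% loss to follow-up
print(ceil(events), ceil(n), n_randomized)
```
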
Yao JC, Shah MH, Ito T, Bohas CL, Wolin EM, Van Cutsem E, Hobday TJ, Okusaka T, Capdevila J, de Vries EG, Tomassetti P, Pavel ME, Hoosen S, Haas T, Lincy J, Lebwohl D, Öberg K; RAD001 in Advanced Neuroendocrine Tumors, Third Trial (RADIANT-3) Study Group. Everolimus for advanced pancreatic neuroendocrine tumors. N Engl J Med. 2011 Feb 10;364(6):514-23. doi: 10.1056/NEJMoa1009290. PMID: 21306238; PMCID: PMC4208619. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4208619/
| Significance Level (1-Sided) | 0.025 |
|---|---|
| Placebo Median Survival (months) | 6 |
| Everolimus Median Survival (months) | 9 |
| Hazard Ratio | 0.66667 |
| Accrual Period (Weeks) | 74 |
| Minimum Follow-Up (Weeks) | 39 |
| Power % (under constant HR) | 92.6 |
# Load required library
library(powerSurvEpi)

# Design parameters (all times in months)
power <- 0.926
alpha <- 0.05                          # two-sided level, equivalent to one-sided 0.025
accrual_time  <- 74 * 12 / 52          # 74 weeks of accrual, in months
followup_time <- 39 * 12 / 52          # 39 weeks of minimum follow-up, in months
median_placebo   <- 6                  # median PFS, placebo arm
median_treatment <- 9                  # median PFS, RAD001 arm
dropout_rate <- 0.10

# Hazard rates assuming exponential PFS: lambda = log(2) / median
lambda_C <- log(2) / median_placebo
lambda_E <- log(2) / median_treatment
RR <- lambda_E / lambda_C              # hazard ratio, about 0.667

# Probability of observing an event during the study, assuming uniform
# accrual over [0, accrual_time] and exponential survival.
# Note: pE and pC in ssizeCT.default are event *probabilities*, not hazard rates.
p_event <- function(lambda, a, f) {
  1 - (exp(-lambda * f) - exp(-lambda * (a + f))) / (a * lambda)
}
pC <- p_event(lambda_C, accrual_time, followup_time)
pE <- p_event(lambda_E, accrual_time, followup_time)

# Required per-group sample sizes; k is the allocation ratio nE / nC,
# not the accrual time
sample_size <- ssizeCT.default(power = power, k = 1,
                               pE = pE, pC = pC,
                               RR = RR, alpha = alpha)

# Adjust for dropout and round up
final_sample_size <- ceiling(sum(sample_size) / (1 - dropout_rate))
print(paste("Total sample size needed, accounting for dropout:", final_sample_size))
proc power;
   twosamplesurvival test=logrank
      groupmedsurvtimes = (6 9)  /* median survival times in months;
                                    hazard ratio 0.66667 is implied */
      accrualtime  = 17.08       /* 74 weeks, converted to months */
      followuptime = 9           /* 39 weeks, converted to months */
      sides = 1                  /* one-sided test */
      alpha = 0.025              /* one-sided significance level */
      power = 0.926              /* desired power */
      groupweights = (1 1)       /* equal allocation */
      ntotal = .;                /* solve for total sample size */
run;
Reference
Introduction
One reason log-rank tests are useful is that they provide an objective criterion (statistical significance) around which to plan a study:
In survival analysis, we need to specify information regarding the censoring mechanism and the particular survival distributions in the null and alternative hypotheses.
We shall assume that patients enter the trial over an accrual period of length \(a\) and are then followed for an additional period \(f\), known as the follow-up time. Patients still alive at the end of follow-up are censored.
Exponential Approximation
For simplicity, constant hazards (i.e., exponential distributions) are generally assumed. Work in the literature indicates that the power and sample size obtained under constant hazards are fairly close to the empirical power of the log-rank test, provided the ratio of the two hazard functions is constant. Since a power analysis seeks only the approximate number of subjects required, and many approximations and guesses are involved anyway, formulas based on the exponential distribution are usually good enough.
Reference
Assumes exponential distributions for both treatment groups. Uses the George-Desu method along with formulas of Schoenfeld that allow estimation of the expected number of events in the two groups. To allow for drop-ins (noncompliance to control therapy, crossover to intervention) and noncompliance of the intervention, the method of Lachin and Foulkes is used.
For handling noncompliance, uses a modification of formula (5.4) of Lachin and Foulkes. Their method is based on a test for the difference in two hazard rates, whereas cpower is based on testing the difference in two log hazards. It is assumed here that the same correction factor can be approximately applied to the log hazard ratio as Lachin and Foulkes applied to the hazard difference.
Note that Schoenfeld approximates the variance of the log hazard ratio by 4/m, where m is the total number of events, whereas the George-Desu method uses the slightly better 1/m1 + 1/m2. Power from this function will thus differ slightly from that obtained with the SAS samsizc program.
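The difference between the two variance approximations is easy to see numerically. A short sketch using the expected event counts from the cpower output shown below (96.9 and 54.2 events in the Peterson-George run; 91.8 and 57.0 in the Schoenfeld run):

```python
from math import sqrt

# George-Desu / Peterson-George style: var(log HR) ~ 1/m1 + 1/m2
m1, m2 = 96.9, 54.2
sd_gd = sqrt(1 / m1 + 1 / m2)      # ~0.1696, as in the first output block

# Schoenfeld: var(log HR) ~ 4/m, where m is the total number of events
m1s, m2s = 91.8, 57.0
sd_sch = sqrt(4 / (m1s + m2s))     # ~0.1640, as in the second output block

# The two agree exactly when events split evenly: 1/m + 1/m == 4/(2m)
print(sd_gd, sd_sch)
```
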
##
## Accrual duration: 1.5 years Minimum follow-up: 5 years
##
## Total sample size: 950
##
## Alpha= 0.05
##
## 5-year Mortalities (Events Rate)
## Control Intervention
## 0.18 0.10
##
## Hazard Rates
## Control Intervention
## 0.03969019 0.02107210
##
## Probabilities of an Event During Study
## Control Intervention
## 0.2039322 0.1140750
##
## Expected Number of Events
## Control Intervention
## 96.9 54.2
##
## Hazard ratio: 0.5309147
##
## Drop-in rate (controls):10%
## Non-adherence rate (intervention):15%
## Effective hazard ratio with non-compliance: 0.6219687
## Standard deviation of log hazard ratio: 0.1696421
## Approximation method of variance of the log hazard ratio based on Peterson B, George SL: Controlled Clinical Trials 14:511–522; 1993.
##
## Power
## 0.7993381
##
## Accrual duration: 1.5 years Minimum follow-up: 5 years
##
## Total sample size: 950
##
## Alpha= 0.05
##
## 5-year Mortalities (Events Rate)
## Control Intervention
## 0.18 0.10
##
## Hazard Rates
## Control Intervention
## 0.03969019 0.02107210
##
## Probabilities of an Event During Study
## Control Intervention
## 0.2039322 0.1140750
##
## Expected Number of Events
## Control Intervention
## 91.8 57.0
##
## Hazard ratio: 0.5309147
##
## Drop-in rate (controls):10%
## Non-adherence rate (intervention):15%
## Effective hazard ratio with non-compliance: 0.6219687
## Standard deviation of log hazard ratio: 0.1639526
## Approximation method of variance of the log hazard ratio based on Schoenfeld D: Biometrics 39:499–503; 1983.
##
## Power
## 0.8254654
Patients will be accrued uniformly over two years and then followed for an additional three years past the accrual period. Some loss to follow-up is expected, with roughly exponential rates that would result in about 50% loss with the standard treatment within 10 years. The loss to follow-up with the proposed treatment is more difficult to predict, but 50% loss would be expected to occur sometime between years 5 and 20.
## time at event estimated = 2
## duration of accrual period = 2
## minimum follow-up time = 3
## Standard treatment: 50% loss with the standard treatment within 10 years
## Proposed treatment: 50% loss would be expected to occur sometime between years 5 and 20
## The "Standard" curve specifying an exponential form with a survival probability of 0.5 at year 5.
## The "Proposed" curve is a piecewise linear curve defined by the five points shown
proc power;
twosamplesurvival test= logrank
accrualtime=2
followuptime=3
power = 0.8
alpha = 0.05
sides = 2
curve("Standard") = 5 : 0.5
curve("Proposed") = (1 to 5 by 1):(0.95 0.9 0.75 0.7 0.6)
groupsurvival = "Standard" | "Proposed"
groupmedlosstimes = 10 | 20 5
npergroup = .;
run;
data _null_;
   /* exponential hazards implied by 30%, 45%, and 50% event proportions */
   h1  = -log(1 - 0.30);
   h2a = -log(1 - 0.45);
   h2b = -log(1 - 0.50);
   put h1= h2a= h2b=;
run;
proc power;
twosamplesurvival test=logrank
/* Specify Analysis Information */
accrualtime=2
followuptime=3
power = 0.8
alpha = 0.05
sides = 2
/* Specify Effects */
gexphs = 0.3567 | 0.5978 0.6931
groupweights = (2 1)
/* Specify Loss Information */
grouplossexphazards=(0.3567 0.3567)
ntotal= .;
plot y=power min=0.5 max=0.90;
run;
Clinical trial to assess new treatment for patients with chronic active hepatitis.
Calculation
proc power;
twosamplesurvival test=logrank
/* Specify Analysis Information */
followuptime = 5
totaltime = 5
power = 0.8
alpha = 0.05
sides = 2
/* Specify Effects */
hazardratio = 0.57
refsurvexphazard=0.178
ntotal = . ;
run;
proc power;
twosamplesurvival
test=logrank
curve("Control") = (0 5):(1 0.8)
curve("Treatment") = (0 5):(1 0.85)
refsurvival = "Control"
accrualtime = 2.5
followuptime = 2.5
hazardratio = 1.373
alpha = 0.05
sides = 2
ntotal = .
power = 0.8;
run;
Piecewise linear survival curve
proc power;
twosamplesurvival test=logrank
curve("Existing Treatment") = 5 : 0.5
curve("Proposed Treatment") = 1:0.95 2:0.90 3:0.75 4:0.70 5:0.60
groupsurvival = "Existing Treatment" | "Proposed Treatment"
accrualtime = 2
followuptime = 3
power = 0.80
alpha=0.05
npergroup = . ;
run;
Group sequential design with interim analyses
The survival probabilities at 12 months for the standard and proposed groups are specified, and the GROUPLOSSEXPHAZARDS option is used to account for the dropout rate.
proc power;
twosamplesurvival test=logrank
curve("Standard") = 12 : 0.8781
curve("Proposed") = 12 : 0.9012
groupsurvival = "Standard" | "Proposed"
accrualtime = 18
totaltime = 24
grouplossexphazards = (0.0012 0.0012)
nsubinterval = 1
power = 0.85
ntotal = . ;
run;
Reference
This procedure is based on the formulas presented in Pintilie (2006) and Machin et al. (2009), which are both based on the original paper Pintilie (2002).
Introduction
The logrank test is used to compare two survival distributions because it is easy to apply and is usually more powerful than an analysis based simply on proportions. It compares survival across the whole span of follow-up, not at just one or two points, and it accounts for censoring.
When analyzing time-to-event data and calculating power and sample size, a complication arises when individuals in the study die from risk factors that are not directly related to the risk factor of interest. For example, a researcher may wish to determine if a new drug for some disease improves patient survival time when compared to a standard treatment. Therefore, the researchers would be interested to know how long each patient lives until he or she dies from the disease. However, during the course of the study, patients may also die from other risks such as myocardial infarction, diabetes, or even an accident. When a patient dies from one of these other risk factors, then the main event of interest cannot be observed, so the true time-to-event of the disease for that patient can never be determined.
Power Overestimated
If the results are not adjusted, the power calculated for the logrank test of the main event of interest may be grossly overestimated, depending on the incidence of the competing risks.
Assumptions
The power and sample size calculations in the module for the logrank test are based on the following assumptions:
Details
The hazard rates for the event of interest and competing risks in group \(i\) are calculated from the cumulative survival functions as \[ \begin{aligned} & h_{e v, i}=\left(\frac{-\ln \left(S_{e v, i}(T 0)\right)}{T 0}\right) \\ & h_{c r, i}=\left(\frac{-\ln \left(S_{c r, i}(T 0)\right)}{T 0}\right) \end{aligned} \] The hazard ratio used in power calculations is calculated from the hazard rates for the event of interest as \[ H R=\left(\frac{h_{e v, 2}}{h_{e v, 1}}\right) \] the hazard rate for the treatment group divided by the hazard rate for the control group. The hazard rates may be calculated using cumulative survival proportions or cumulative incidences as described above.
Then we can calculate Probability of Event and Number of Event
Probability of Event
With the hazard rates for the event of interest and competing risks, the probability of observing the event of interest in a subject in group \(i, P r_{e v, i}\), is given as \[ P r_{e v, i}=\frac{h_{e v, i}}{h_{e v, i}+h_{c r, i}}\left(1-\frac{\exp \left\{-(T-R) \times\left(h_{e v, i}+h_{c r, i}\right)\right\}-\exp \left\{-T \times\left(h_{e v, i}+h_{c r, i}\right)\right\}}{R \times\left(h_{e v, i}+h_{c r, i}\right)}\right), \] where \(T\) is the total time of trial and \(R\) is the accrual time. The follow-up time is calculated from \(T\) and \(R\) as \[ \text { Follow-Up Time }=T-R \text {. } \] The overall probability of observing the event of interest during the study in both groups is given as \[ P r_{e v}=p_1 P r_{e v, 1}+\left(1-p_1\right) P r_{e v, 2} \] where \(p_1\) is the proportion of subjects in group 1 , the control group.
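A small Python sketch of these hazard-rate and event-probability calculations, using the inputs from the worked example later in this section (T0 = 3, Sev,1(T0) = 0.5, Scr,1(T0) = 0.4, HR = 0.5, T = 5, R = 3):

```python
from math import exp, log

T0, T, R = 3.0, 5.0, 3.0
S_ev1, S_cr1, HR = 0.5, 0.4, 0.5

# Hazard rates from the cumulative survival at the fixed time point T0
h_ev1 = -log(S_ev1) / T0
h_cr1 = -log(S_cr1) / T0          # competing-risk hazard, same in both groups
h_ev2 = HR * h_ev1                # event hazard in the treatment group

def pr_event(h_ev, h_cr, T, R):
    """Probability of observing the event of interest during the study."""
    h = h_ev + h_cr
    return (h_ev / h) * (1 - (exp(-(T - R) * h) - exp(-T * h)) / (R * h))

pr1 = pr_event(h_ev1, h_cr1, T, R)   # ~0.3575
pr2 = pr_event(h_ev2, h_cr1, T, R)   # ~0.2073
print(pr1, pr2)
```
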
Number of Events
When dealing with time-to-event data, it is the number of events observed, not the total number of subjects, that is important for achieving the specified power. The total required number of events (for the event of interest), \(E\), is calculated from the total sample size \(N\) and \(P r_{e v}\) as \[ E=N \times P r_{e v} \] The number of events in group \(i\) is calculated as \[ E_i=n_i \times P r_{e v, i} \] where \(n_i\) is the sample size for the \(i^{\text{th}}\) group.
Power and Sample Size Calculations
Assuming an exponential model and independence of failure times for the event of interest and competing risks, Pintilie (2006) gives the following equation relating E (total number of events for the risk factor of interest) and power:
\[ z_{1-\beta}=\sqrt{E \times p_1\left(1-p_1\right)}\left|\log (H R)\right|-z_{1-\alpha / 2} \] where \(p_1\) is the proportion of subjects in group 1 (the control group) and the \(z\)'s are standard normal quantiles.
This power formula indicates that it is the total number of events observed, not the number of subjects that is critical for achieving the desired power for the logrank test.
The power formula can be rearranged to solve for \(E\), the total number of events required. The formula is \[ E=\left(\frac{1}{p_1\left(1-p_1\right)}\right) \times\left(\frac{z_{1-\alpha / 2}+z_{1-\beta}}{\log (H R)}\right)^2 . \] The overall sample size can be computed from \(E\) and \(P r_{e v}\) as \[ N=\frac{E}{P r_{e v}}=\left(\frac{1}{p_1\left(1-p_1\right) \times P r_{e v}}\right) \times\left(\frac{z_{1-\alpha / 2}+z_{1-\beta}}{\log (H R)}\right)^2 . \] The individual group sample sizes are calculated as \[ \begin{aligned} & n_1=N \times p_1, \\ & n_2=N \times\left(1-p_1\right), \end{aligned} \] where \(p_1\) is the proportion of subjects in group 1 , the control group.
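Putting the pieces together, the worked example below (N = 150, 50% in group 1, two-sided alpha = 0.05, HR = 0.5) can be reproduced in a few lines; the per-group event probabilities are taken as given here (they match the competing-risks values in this section):

```python
from math import log, ceil
from statistics import NormalDist

N, p1 = 150, 0.5
alpha, HR = 0.05, 0.5
pr1, pr2 = 0.3574638, 0.2072824        # per-group event probabilities

pr_ev = p1 * pr1 + (1 - p1) * pr2      # overall event probability
E = N * pr_ev                          # total events for the event of interest

z = NormalDist().inv_cdf
z_power = (E * p1 * (1 - p1)) ** 0.5 * abs(log(HR)) - z(1 - alpha / 2)
power = NormalDist().cdf(z_power)      # ~0.616

print(ceil(E), round(power, 4))
```
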
Alternative Hypothesis: Two-Sided
Alpha: 0.05
R (Accrual Time): 3
T-R (Follow-Up Time): 2
T0 (Fixed Time Point): 3
Sev1(T0) (Control): 0.5
HR (Hazard Ratio = hev2 / hev1): 0.5
Scr1(T0) (Control): 0.4
Percent in Group 1: 50
Power: 0.6162274
Total Sample Size (N): 150
##
## Sample Size Calculation using Logrank Tests Accounting for Competing Risks
## Alpha 0.025
## Power 61.62274 %
##
## Accrual time: 3 years
## Total time of trial: 5 years
## Follow-Up Time: 2 years
##
## Survival probability for the event of interest in group 1: 0.5
## Survival probability for the event of interest in group 2: 0.7071068
## Hazard Ratio: 0.5
##
## Competing risks probability: 0.4
##
## Proportion of subjects in group 1: 0.5
## Proportion of subjects in group 2: 0.5
##
## The probability of observing the event of interest in a subject during the study for the group 1: 0.3574638
## The probability of observing the event of interest in a subject during the study for the group 2: 0.2072824
##
## The number of events required for the group 1: 27
## The number of events required for the group 2: 16
## The total number of events required for the study: 43
##
## The sample sizes for the group 1: 75
## The sample sizes for the group 2: 75
## The total sample size of both groups combined: 150
With Interim Analysis
## N1_Event IA N2_Event IA N1_Patient IA N2_Patient IA NTotal IA N_Patient FU Power
## 62.00 3.00 134.00 15.00 149.00 32.00 91.24
| Method | Description |
|---|---|
| Log-Rank | “Average Hazard Ratio” – same as from univariate Cox Regression model |
| Linear-Rank (Weighted) | Gehan-Breslow-Wilcoxon, Tarone-Ware, Farrington-Manning, Peto-Peto, Threshold Lag, Modestly Weighted Linear-Rank (MWLRT) |
| Piecewise Linear-Rank | Piecewise Parametric, Weighted Piecewise Model (e.g. APPLE), Change Point Models |
| Combination | Maximum Combination (MaxCombo) Test Procedure |
| Survival Time | Milestone Survival (KM), Restricted Mean Survival Time, Landmark Analysis |
| Relative Time | Ratio of Times to Reach Event Proportion, Accelerated Failure Time Models |
| Others | Responder-Based, Frailty Models, Renyi Models, Net Benefit (Buyse) |
1. Concept: The MaxCombo test is designed to handle multiple linear-rank tests simultaneously and to select the "best" test among the candidates. This approach helps control the Type I error rate while still allowing flexibility in the choice of statistical tests.
2. Test Variants: Various forms of the Fleming-Harrington family of tests (denoted F-H(G) tests) are used, each specified by a different parameterization G(p,q) that emphasizes a different portion of the survival curve. For example, some focus more on early failures, others on late failures.
| F-H (G) Tests | Proposal |
|---|---|
| G(0,1; 1,0) | Lee (2007) |
| G(0,0*; 0,1; 1,0) | Karrison (2016) |
| G(0,0; 0,1; 1,0; 1,1) | Lin et al (2020) |
| G(0,0; 0,0.5; 0.5,0; 0.5,0.5) | Roychoudhury et al (2021) |
| G(0,0; 0,0.5) | Mukhopadhyay et al (2022) |
| G(0,0; 0,0.5; 0.5,0) | Mukhopadhyay et al (2022) |
3. Common Usage: Typically, 2-4 candidate tests are considered, with Fleming-Harrington being popular due to its flexibility. It can accommodate the Log-Rank and Peto-Peto tests, among others, allowing researchers to tailor the analysis to the specific characteristics of their survival data.
Issues with MaxCombo Tests
1. Type I Error and Estimand: Critics point out that MaxCombo tests, while versatile, can sometimes yield significant results even when the treatment is not better than the control at all times. This can mislead conclusions about a treatment's efficacy, especially if it is only effective late in the follow-up period (late efficacy).
2. Interpretability: There are concerns about the interpretability of using an average hazard ratio as the estimand, because it might not accurately reflect the dynamics of the treatment effect over time, particularly under non-proportional hazards.
3. Alternatives for Improvement: Modifications to the Fleming-Harrington weights (the G(p,q) parameters) are suggested to better handle scenarios with non-proportional hazards. For example, the focus can be shifted from early to late survival times by adjusting these parameters.
4. Communication of Results: It is recommended to use the MaxCombo for analytical purposes but to communicate the results using more interpretable measures such as the Restricted Mean Survival Time (RMST), which provides a direct, clinically meaningful measure of survival benefit.
Reference
Introduction
We will use \(\hat{\theta}\) as our test statistic, and reject \(H_0\) in favor of \(H_A\) if \(\hat{\theta}>k\) for some constant \(k\).
- The significance level of the test, or Type I error rate, is \(\alpha=P\left(\hat{\theta}>k \mid \theta=\theta_0\right)\).
  - If \(Z=\frac{\hat{\theta}-\theta}{1 / \sqrt{d}}\), then \(\alpha=P\left(Z>\frac{k-\theta_0}{1 / \sqrt{d}}\right)\).
  - Let \(\Phi\left(z_\alpha\right)=1-\alpha\); then \(z_\alpha=\frac{k-\theta_0}{1 / \sqrt{d}}\) and hence \(k=\theta_0+\frac{z_\alpha}{\sqrt{d}}\).
- The power of the test is given by \[ 1-\beta=P\left(\hat{\theta}>k \mid \theta=\theta_A\right)=P\left(Z>\frac{k-\theta_A}{1 / \sqrt{d}}\right) \]
- Solving for \(d\), we have \[ \begin{gathered} z_{1-\beta}=-z_\beta=\sqrt{d}\left(k-\theta_A\right)=\sqrt{d}\left(\theta_0+\frac{z_\alpha}{\sqrt{d}}-\theta_A\right) \\ \Rightarrow d=\frac{\left(z_\beta+z_\alpha\right)^2}{\left(\theta_A-\theta_0\right)^2}=\frac{\left(z_\beta+z_\alpha\right)^2}{(\log \Delta)^2} . \end{gathered} \]
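For example, with a one-sided 5% test, 80% power, and \(\Delta = 1.5\) (the design used in the examples in this section), this formula gives the required number of deaths. A quick Python check:

```python
from math import log, ceil
from statistics import NormalDist

alpha, power, Delta = 0.05, 0.80, 1.5
z = NormalDist().inv_cdf

# d = (z_beta + z_alpha)^2 / (log Delta)^2, with one-sided quantiles
d = (z(1 - alpha) + z(power)) ** 2 / log(Delta) ** 2
print(ceil(d))   # 38
```
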
Probability of Event
Calculate patient/subject needed based on Probability of Event
We need to provide an estimate of the proportion \(\pi\) of patients who will die by the time of analysis.
- If all patients entered at the same time, we would simply have \(\pi=1-S_\lambda(t)\), where \(t\) is the follow-up time.
- However, patients actually enter over an accrual period of length \(a\) and then, after accrual to the trial has ended, they are followed for an additional time \(f\).
- So a patient who enters at time \(t=0\) will have failure probability \(\pi(0)=1-S_\lambda(a+f)\), as this patient has the maximum possible follow-up time \(a+f\).
- Similarly, a patient who enters at time \(t \in[0, a]\) has failure probability \(\pi(t)=1-S_\lambda(a+f-t)\).
- Assuming that patients enter uniformly between times 0 and \(a\), the probability of death is \[ \pi=\int_0^a \frac{1}{a}\left[1-S_\lambda(a+f-t)\right] d t . \]
- Assuming \(S_\lambda(t)=e^{-\lambda t}\), we have \[ \pi=1-\frac{1}{a \lambda}\left[e^{-\lambda f}-e^{-\lambda(a+f)}\right] . \]
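Continuing the example in this section (\(\lambda = 0.1\), \(a = 2\), \(f = 3\), one-sided 5% test, 80% power, \(\Delta = 1.5\)), the event probability and required accrual can be cross-checked in a few lines of Python:

```python
from math import exp, log, ceil
from statistics import NormalDist

lam, a, f = 0.10, 2.0, 3.0
alpha, power, Delta = 0.05, 0.80, 1.5
z = NormalDist().inv_cdf

# Probability of death under uniform accrual and exponential survival
pi = 1 - (exp(-lam * f) - exp(-lam * (a + f))) / (a * lam)   # ~0.329

# Required deaths (one-sided test), then required patients
d = (z(1 - alpha) + z(power)) ** 2 / log(Delta) ** 2
n = ceil(ceil(d) / pi)
print(round(pi, 3), ceil(d), n)
```
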
Suppose that we are designing a Phase II oncology trial where we plan a 5% level (one-sided) test, and we need 80% power to detect a hazard ratio of 1.5. We can find the required number of deaths as follows:
## Log-mean based approach
## Expected number of events
ssc.onesample.logMean(HR = 1.5, sig.level = 0.05, power = 0.8)
##
## Hazard ratio: 1.5
## Alpha (one-sided): 0.05
## Power: 80 %
##
## Log-mean based approach
## Expected number of events: 38
We wanted to design a Phase II oncology trial where we plan a 5% level (one-sided) test, and we need 80% power to detect a hazard ratio of 1.5.
Suppose that \(\lambda_0=0.15\), then we have \(\lambda_A=\lambda_0 / \Delta=0.1\). Assume accrual period \(a=2\) years and follow-up time \(f=3\) years. The probability of death under \(H_A: \lambda=0.1\) is computed as:
ssc.onesample.logMean2(HR = 1.5, sig.level = 0.05, power = 0.8, lambda=0.10, accrual=2, followup=3)
## Expected number of events: 38
## Probability of event: 0.329
## Expected number of patients: 116
Reference
Introduction
For fixed \(d\), \(V=\sum t_i \sim \operatorname{Gamma}(d, \lambda)\) and it is known that \[ W=\frac{2 d \lambda}{\hat{\lambda}} \sim \chi_{2 d}^2, \] although this result is approximate for general censoring patterns. Under \(H_0: \lambda=\lambda_0\), we need to find a constant \(k\) such that \(\alpha=P\left(1 / \hat{\lambda}>k \mid \lambda=\lambda_0\right)=P\left(W>2 d k \lambda_0\right)\). Thus \(\chi_{2 d, \alpha}^2=2 d k \lambda_0\) and hence \(k=\frac{\chi_{2 d, \alpha}^2}{2 d \lambda_0}\). The power of the test is given by \[ 1-\beta=P\left(1 / \hat{\lambda}>k \mid \lambda=\lambda_A\right)=P\left(W>2 d k \lambda_A\right) . \] We have \(\chi_{2 d, 1-\beta}^2=2 d k \lambda_A \Rightarrow \chi_{2 d, 1-\beta}^2=\frac{\chi_{2 d, \alpha}^2 \lambda_A}{\lambda_0}\), hence \(\Delta=\frac{\lambda_0}{\lambda_A}=\frac{\chi_{2 d, \alpha}^2}{\chi_{2 d, 1-\beta}^2}\). For specified \(\alpha\), power \(1-\beta\), and ratio \(\Delta\), we may solve this for the required number of deaths, \(d\).
\(\Delta\) can be computed using the following function:
expLikeRatio <- function(d, alpha, pwr) {
  num   <- qchisq(alpha, df = 2 * d, lower.tail = FALSE)
  denom <- qchisq(pwr, df = 2 * d, lower.tail = FALSE)
  num / denom
}
To get the number of deaths \(d\) for a specified \(\Delta\), we define a new function \(L R(d)=\frac{\chi_{2 d, \alpha}^2}{\chi_{2 d, 1-\beta}^2}-\Delta\). The solution for \(L R(d)=0\) is the required number of deaths and is computed as:
expLRdeaths <- function(Delta, alpha, pwr) {
  LR <- function(d, alpha, pwr, Delta) {
    expLikeRatio(d, alpha, pwr) - Delta
  }
  # Find the root of LR(d) = 0
  result <- uniroot(f = LR, lower = 1, upper = 1000,
                    alpha = alpha, pwr = pwr, Delta = Delta)
  result$root
}
Suppose that we are designing a Phase II oncology trial where we plan a 5% level (one-sided) test, and we need 80% power to detect a hazard ratio of 1.5. We can find the required number of deaths as follows:
ssc.onesample.LR(HR = 1.5, sig.level = 0.05, power = 0.8)
## Expected number of events: 37
Introduction